Add must-gather script for diag collection of NIM operator and operands #496

Merged
shivamerla merged 4 commits into NVIDIA:main from shivamerla:add_must_gather
May 15, 2025

Conversation

@shivamerla
Collaborator

No description provided.

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented May 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Comment thread hack/must-gather.sh
Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
Collaborator

@visheshtanksale visheshtanksale left a comment

LGTM!

@mkhaas
Collaborator

mkhaas commented May 15, 2025

It would be good to also collect PVC and storage class information.
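
A rough sketch of what that collection could look like, not the code that was merged. `K` and `ARTIFACT_DIR` follow the names visible in the script trace posted in this thread; the `gather_storage` function and the output file names are hypothetical and chosen only for illustration.

```shell
# Sketch only: one possible way to collect the storage objects requested here.
# K and ARTIFACT_DIR mirror names from the script trace in this thread;
# gather_storage and the file layout are assumptions, not the merged script.
K="${K:-kubectl}"

gather_storage() {
  local ns="$1"
  local out="${ARTIFACT_DIR}/storage"
  mkdir -p "${out}" || return 1

  # StorageClasses and PersistentVolumes are cluster-scoped.
  $K get storageclass -o yaml > "${out}/storageclasses.yaml"
  $K get pv -o yaml > "${out}/pv.yaml"

  # PersistentVolumeClaims are namespaced, so dump them per namespace.
  $K get pvc -n "${ns}" -o yaml > "${out}/pvc_${ns}.yaml"
}
```

Calling `gather_storage nemo` with `ARTIFACT_DIR` set would write three YAML files under `${ARTIFACT_DIR}/storage`.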

@mkhaas
Collaborator

mkhaas commented May 15, 2025

Also, dump the Ingress resources, if any are configured.
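
As an illustration of the "if configured" part, the sketch below first lists Ingress objects by name and only writes a dump when the list is non-empty. It is an assumption about how this could be done, not the merged implementation; `gather_ingress` and the output path are hypothetical, while `K` and `ARTIFACT_DIR` follow the names in the script trace posted in this thread.

```shell
# Sketch only: dump Ingress objects for a namespace when at least one exists.
# gather_ingress and the file layout are assumptions for illustration.
K="${K:-kubectl}"

gather_ingress() {
  local ns="$1"
  local names
  # "-o name" prints one resource name per line, or nothing if none exist.
  names="$($K get ingress -n "${ns}" -o name 2>/dev/null)"
  if [ -z "${names}" ]; then
    echo "No Ingress resources found in ${ns}, skipping"
    return 0
  fi
  mkdir -p "${ARTIFACT_DIR}/ingress" || return 1
  $K get ingress -n "${ns}" -o yaml > "${ARTIFACT_DIR}/ingress/ingress_${ns}.yaml"
}
```

Checking for names first avoids leaving an empty list file in the archive for clusters that expose the services some other way.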

Collaborator

@visheshtanksale visheshtanksale left a comment

LGTM!!

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
@shivamerla
Collaborator Author

+ K=kubectl
+ kubectl version
++ date +%Y%m%d_%H%M
+ export ARTIFACT_DIR=/tmp/nim-nemo-must-gather_20250515_1832
+ ARTIFACT_DIR=/tmp/nim-nemo-must-gather_20250515_1832
+ mkdir -p /tmp/nim-nemo-must-gather_20250515_1832
+ exec
+ exec
++ tee /tmp/nim-nemo-must-gather_20250515_1832/must-gather.log
Gathering Kubernetes version info
Gathering GPU node status
Gathering GPU node descriptions
Gathering NIM Operator pods from nim-operator
Gathering storage class, PVC and PV information
Gathering PVCs from NIM_NAMESPACE: nemo
Gathering PVCs from NEMO_NAMESPACE: nemo
Gathering NIMPipeline, NIMService and NIMCache CRs from nemo
Gathering ConfigMaps in nemo owned by NIMCache
Gathering NIMService pods from nemo
Gathering Ingress configuration from nemo
Gathering NeMo CRs from nemo
Gathering NeMo microservice pods from nemo
Gathering Ingress configuration from nemo
Must gather logs collected successfully and saved to: /tmp/nim-nemo-must-gather_20250515_1832
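
For readers who have not opened hack/must-gather.sh, the first lines of the trace above (timestamped directory, `mkdir -p`, `tee` into `must-gather.log`) can be approximated as follows. This is a sketch under assumptions, not the script's actual code: the helper names `make_artifact_dir` and `run_logged` are hypothetical, and where the trace suggests `exec`-based redirection, this version uses a plain pipeline into `tee` instead.

```shell
# Sketch only: mirrors the steps visible in the trace above; it is not the
# contents of hack/must-gather.sh.

# make_artifact_dir (hypothetical helper) creates a timestamped output
# directory. The name prefix and the date format come from the trace.
make_artifact_dir() {
  local base="${1:-/tmp}"
  local dir
  dir="${base}/nim-nemo-must-gather_$(date +%Y%m%d_%H%M)"
  mkdir -p "${dir}" || return 1
  printf '%s\n' "${dir}"
}

# run_logged (hypothetical helper) runs a command and copies its stdout and
# stderr to both the terminal and must-gather.log inside the directory.
run_logged() {
  local log_dir="$1"
  shift
  "$@" 2>&1 | tee -a "${log_dir}/must-gather.log"
}

# Example usage (commented out so that sourcing this file has no side effects):
#   ARTIFACT_DIR="$(make_artifact_dir)"; export ARTIFACT_DIR
#   run_logged "${ARTIFACT_DIR}" echo "Gathering Kubernetes version info"
```

Keeping the log inside the artifact directory means a single archive of that directory carries both the collected manifests and the record of how they were gathered.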

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
@shivamerla shivamerla merged commit 0aa9363 into NVIDIA:main May 15, 2025
7 checks passed
varunrsekar pushed a commit to varunrsekar/k8s-nim-operator that referenced this pull request May 20, 2025
NVIDIA#496)

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Add Copyright, Usage guide and dump model manifests

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Collect storage information along with ingress

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Update usage

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
varunrsekar pushed a commit that referenced this pull request May 20, 2025
#496)

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Add Copyright, Usage guide and dump model manifests

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Collect storage information along with ingress

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>

Update usage

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
3 participants